Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dd: use an alarm thread instead of elapsed() calls #4447

Merged
merged 5 commits into from
Jun 19, 2023

Conversation

Freaky
Copy link
Contributor

@Freaky Freaky commented Feb 28, 2023

dd's main IO loop makes an Instant::elapsed call for every block copied for updating the display thread. When I implemented status=progress support in FreeBSD's dd I used a SIGALRM to set a flag in much the same way its SIGINFO support worked, which avoids all those timekeeping calls.

This isn't portable, but a similar approach can be used with a thread in a sleep loop and an atomic flag. I knocked one together and ran a comparison with FreeBSD/amd64 13.1-RELEASE on an AMD 5700X:

Benchmark 1: /bin/dd if=/dev/zero of=/dev/null count=4000000 status=progress
  Time (mean ± σ):     733.9 ms ±  87.4 ms    [User: 147.9 ms, System: 585.9 ms]
  Range (min … max):   638.9 ms … 837.4 ms    100 runs

Benchmark 2: ./dd-no-timekeeping if=/dev/zero of=/dev/null count=4000000 status=progress
  Time (mean ± σ):     797.9 ms ±  84.5 ms    [User: 202.4 ms, System: 595.5 ms]
  Range (min … max):   690.6 ms … 878.4 ms    100 runs

  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet PC without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.

Benchmark 3: ./dd-thread-alarm if=/dev/zero of=/dev/null count=4000000 status=progress
  Time (mean ± σ):     808.7 ms ±  83.9 ms    [User: 209.8 ms, System: 598.9 ms]
  Range (min … max):   718.8 ms … 912.3 ms    100 runs

Benchmark 4: ./dd-baseline if=/dev/zero of=/dev/null count=4000000 status=progress
  Time (mean ± σ):     904.4 ms ±  83.3 ms    [User: 327.5 ms, System: 576.9 ms]
  Range (min … max):   814.4 ms … 1010.4 ms    100 runs

Summary
  '/bin/dd if=/dev/zero of=/dev/null count=4000000 status=progress' ran
    1.09 ± 0.17 times faster than './dd-no-timekeeping if=/dev/zero of=/dev/null count=4000000 status=progress'
    1.10 ± 0.17 times faster than './dd-thread-alarm if=/dev/zero of=/dev/null count=4000000 status=progress'
    1.23 ± 0.19 times faster than './dd-baseline if=/dev/zero of=/dev/null count=4000000 status=progress'

dd-no-timekeeping has the elapsed() call completely elided and doesn't respect status=progress, dd-thread-alarm is this patch, and dd-baseline is unmodified uutils. It would be interesting to see if this holds up on other platforms and other situations.

I just messily dumped Alarm near the top of the file for now.

Quick benchmark on FreeBSD 13.1-RELEASE:

Summary
  './dd-thread-alarm if=/dev/zero of=/dev/null count=4000000 status=progress' ran
  1.17 ± 0.17 times faster than './dd-baseline if=/dev/zero of=/dev/null count=4000000
        status=progress'
@Freaky
Copy link
Contributor Author

Freaky commented Feb 28, 2023

One small change in behaviour I observe is the initial progress update is now delayed by a second. This matches both FreeBSD and GNU dd behaviour, and that first report after a single block has been copied isn't very useful anyway.

@tertsdiepraam
Copy link
Member

Cool! It's a nice solution. In your experiments, dd is the FreeBSD dd instead of GNU, right? (Just to confirm if we get different numbers on other platforms)

One small change in behaviour I observe is the initial progress update is now delayed by a second.

This should be fine. If we really want to we could just send an empty update before the loop if we want to create the progress bar early.

@Freaky
Copy link
Contributor Author

Freaky commented Feb 28, 2023

Yes, my tests are against BSD dd. I do have GNU coreutils 9.1 installed, and it looks to roughly match uutils baseline:

Benchmark 1: /bin/dd if=/dev/zero of=/dev/null count=4000000 status=progress
  Time (mean ± σ):     725.6 ms ±  85.7 ms    [User: 152.2 ms, System: 573.3 ms]
  Range (min … max):   638.5 ms … 833.8 ms    100 runs

Benchmark 2: /usr/local/bin/gdd if=/dev/zero of=/dev/null count=4000000 status=progress
  Time (mean ± σ):     916.7 ms ±  84.9 ms    [User: 325.0 ms, System: 591.6 ms]
  Range (min … max):   812.3 ms … 1001.1 ms    100 runs

Benchmark 3: ./dd-thread-alarm if=/dev/zero of=/dev/null count=4000000 status=progress
  Time (mean ± σ):     816.4 ms ±  84.1 ms    [User: 216.0 ms, System: 600.4 ms]
  Range (min … max):   720.4 ms … 921.7 ms    100 runs

Benchmark 4: ./dd-baseline if=/dev/zero of=/dev/null count=4000000 status=progress
  Time (mean ± σ):     906.7 ms ±  87.6 ms    [User: 332.3 ms, System: 574.4 ms]
  Range (min … max):   813.1 ms … 1035.6 ms    100 runs

Summary
  '/bin/dd if=/dev/zero of=/dev/null count=4000000 status=progress' ran
    1.13 ± 0.18 times faster than './dd-thread-alarm if=/dev/zero of=/dev/null count=4000000 status=progress'
    1.25 ± 0.19 times faster than './dd-baseline if=/dev/zero of=/dev/null count=4000000 status=progress'
    1.26 ± 0.19 times faster than '/usr/local/bin/gdd if=/dev/zero of=/dev/null count=4000000 status=progress'

And indeed it looks to follow the same approach as baseline:

      if (status_level == STATUS_PROGRESS)
        {
          xtime_t progress_time = gethrxtime ();
          if (next_time <= progress_time)
            {
              print_xfer_stats (progress_time);
              next_time += XTIME_PRECISION;
            }
        }

@tertsdiepraam
Copy link
Member

Thanks for checking! This approach seems to be similar to what indicatif is doing for their progress bars too, so I think this makes sense.

trigger: Arc<AtomicBool>,
}

impl Alarm {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it would be nice if you could add some documentations / comment here :)

@sylvestre
Copy link
Contributor

Sorry for the latency. Could you please add your benchmark in src/uu/dd/BENCHMARKING.md if it isn't covered? thanks

@Freaky
Copy link
Contributor Author

Freaky commented May 3, 2023

Alarm seems to also be a natural solution for SIGINFO/SIGUSR support, removing the delay that's currently in the Linux path because it's only checked on interval updates.

@Freaky Freaky marked this pull request as ready for review May 5, 2023 14:51
@uutils uutils deleted a comment from github-actions bot Jun 11, 2023
@uutils uutils deleted a comment from github-actions bot Jun 11, 2023
Copy link
Collaborator

@jfinkels jfinkels left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't check the code out locally, but this makes sense to me

let trigger = Arc::new(AtomicBool::default());

let weak_trigger = Arc::downgrade(&trigger);
thread::spawn(move || {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So now if I understand correctly, there are three threads: the main thread, this thread which has a periodic alarm, and the progress output thread. Is that right?

@sylvestre sylvestre merged commit 66723e0 into uutils:main Jun 19, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants